65 research outputs found
Interplay between Secondary and Tertiary Structure Formation in Protein Folding Cooperativity
Protein folding cooperativity is defined by the nature of the finite-size
thermodynamic transition exhibited upon folding: two-state transitions show a
free energy barrier between the folded and unfolded ensembles, while downhill
folding is barrierless. A microcanonical analysis, where the energy is the
natural variable, has been shown to be better suited to unambiguously
characterize the nature of the transition than its canonical counterpart. Replica
exchange molecular dynamics simulations of a high-resolution coarse-grained
model allow for the accurate evaluation of the density of states, in order to
extract precise thermodynamic information, and measure its impact on structural
features. The method is applied to three helical peptides: a short helix shows
sharp features of a two-state folder, while a longer helix and a three-helix
bundle exhibit downhill and two-state transitions, respectively. Extending the
results of lattice simulations and theoretical models, we find that it is the
interplay between secondary structure and the loss of non-native tertiary
contacts which determines the nature of the transition.
Comment: 3 pages, 3 figures
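The microcanonical analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes a commonly used criterion that the abstract itself does not spell out: with entropy S(E) = ln Ω(E) (k_B = 1), a two-state transition appears as a region where the inverse temperature β(E) = dS/dE increases with energy ("backbending"), while a downhill transition leaves β(E) monotonically decreasing. The function names and the synthetic entropy curves are hypothetical and are not taken from the paper.

```python
def inverse_temperature(energies, log_dos):
    """Finite-difference estimate of beta(E) = dS/dE, with S = ln(Omega) and k_B = 1."""
    betas = []
    for i in range(len(energies) - 1):
        d_energy = energies[i + 1] - energies[i]
        betas.append((log_dos[i + 1] - log_dos[i]) / d_energy)
    return betas


def has_backbending(energies, log_dos, tol=1e-9):
    """Return True if beta(E) increases anywhere, i.e. S(E) has a convex region."""
    betas = inverse_temperature(energies, log_dos)
    return any(b_next - b_prev > tol for b_prev, b_next in zip(betas, betas[1:]))


# Synthetic (illustrative) entropy curves on a uniform energy grid.
energies = [0.1 * i - 2.0 for i in range(41)]
concave_entropy = [-e * e for e in energies]                 # strictly concave: no barrier
intruder_entropy = [-e ** 4 + 2.0 * e * e for e in energies]  # convex region around E = 0

downhill_like = has_backbending(energies, concave_entropy)
two_state_like = has_backbending(energies, intruder_entropy)
```

In practice the density of states would come from reweighting replica-exchange data, and statistical noise in ln Ω(E) would call for smoothing before differentiating; the tolerance `tol` here only guards against floating-point round-off.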
Efficient potential of mean force calculation from multiscale simulations: solute insertion in a lipid membrane
The determination of potentials of mean force for solute insertion in a
membrane by means of all-atom molecular dynamics simulations is often hampered
by sampling issues. A multiscale approach to conformational sampling was
recently proposed by Bereau and Kremer (2016). It aims at accelerating the
sampling of the atomistic conformational space by means of a systematic
backmapping of coarse-grained snapshots. In this work, we first analyze the
efficiency of this method by comparing its predictions for propanol insertion
into a 1,2-Dimyristoyl-sn-glycero-3-phosphocholine membrane (DMPC) against
reference atomistic simulations. The method is found to provide accurate
results with a gain of one order of magnitude in computational time. We then
investigate the role of the coarse-grained representation in affecting the
reliability of the method in the case of a
1,2-Dioleoyl-sn-glycero-3-phosphocholine membrane (DOPC). We find that the
accuracy of the results is tightly connected to the presence of a good
configurational overlap between the coarse-grained and atomistic models---a
general requirement when developing multiscale simulation methods.
Comment: 6 pages, 5 figures
Transferable atomic multipole machine learning models for small organic molecules
Accurate representation of the molecular electrostatic potential, which is
often expanded in distributed multipole moments, is crucial for an efficient
evaluation of intermolecular interactions. Here we introduce a machine learning
model for multipole coefficients of atom types H, C, O, N, S, F, and Cl in any
molecular conformation. The model is trained on quantum chemical results for
atoms in varying chemical environments drawn from thousands of organic
molecules. Multipoles in systems with neutral, cationic, and anionic molecular
charge states are treated with individual models. The models' predictive
accuracy and applicability are illustrated by evaluating intermolecular
interaction energies of nearly 1,000 dimers and the cohesive energy of the
benzene crystal.
Comment: 11 pages, 6 figures
Hydration free energies from kernel-based machine learning: Compound-database bias
We consider the prediction of a basic thermodynamic property---hydration free
energies---across a large subset of the chemical space of small organic
molecules. Our in silico study is based on computer simulations at the
atomistic level with implicit solvent. We report on a kernel-based machine
learning approach that is inspired by recent work in learning electronic
properties, but differs in key aspects: The representation is averaged over
several conformers to account for the statistical ensemble. We also include an
atomic-decomposition ansatz, which we show offers significant added
transferability compared to molecular learning. Finally, we explore the
existence of severe biases from databases of experimental compounds. By
performing a combination of dimensionality reduction and cross-learning models,
we show that the rate of learning depends significantly on the breadth and
variety of the training dataset. Our study highlights the dangers of fitting
machine-learning models to databases of narrow chemical range.
Comment: 10 pages, 7 figures
Controlled exploration of chemical space by machine learning of coarse-grained representations
The size of chemical compound space is too large to be probed exhaustively.
This leads high-throughput protocols to drastically subsample and results in
sparse and non-uniform datasets. Rather than arbitrarily selecting compounds,
we systematically explore chemical space according to the target property of
interest. We first perform importance sampling by introducing a Markov chain
Monte Carlo scheme across compounds. We then train an ML model on the sampled
data to expand the region of chemical space probed. Our boosting procedure
enhances the number of compounds by a factor of 2 to 10, enabled by the ML model's
coarse-grained representation, which both simplifies the structure-property
relationship and reduces the size of chemical space. The ML model correctly
recovers linear relationships between transfer free energies. These linear
relationships correspond to features that are global to the dataset, marking
the region of chemical space up to which predictions are reliable---a more
robust alternative to the predictive variance. Bridging coarse-grained
simulations with ML gives rise to an unprecedented database of drug-membrane
insertion free energies for 1.3 million compounds.
Comment: 9 pages, 5 figures
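The importance-sampling step described in this abstract, a Markov chain Monte Carlo walk across compounds, can be sketched with a plain Metropolis scheme. This is only an illustration under stated assumptions: the abstract does not give the move set, the target distribution, or the acceptance rule, so the integer "compounds", `toy_score`, `toy_neighbors`, and the parameter `beta` below are all hypothetical stand-ins.

```python
import math
import random


def acceptance_probability(score_old, score_new, beta=1.0):
    """Metropolis rule: always accept a lower score, else accept with exp(-beta * increase)."""
    delta = score_new - score_old
    return 1.0 if delta <= 0 else math.exp(-beta * delta)


def sample_compounds(score, neighbors, start, n_steps, beta=1.0, seed=0):
    """Random walk over compounds that favors low values of the target score."""
    rng = random.Random(seed)
    current = start
    visited = [current]
    for _ in range(n_steps):
        candidate = rng.choice(neighbors(current))
        p_accept = acceptance_probability(score(current), score(candidate), beta)
        if rng.random() < p_accept:
            current = candidate
        visited.append(current)  # rejected moves repeat the current compound
    return visited


# Hypothetical toy chemical space: compounds are the integers 0..20,
# and the property of interest is optimal at compound 7.
def toy_score(compound):
    return abs(compound - 7)


def toy_neighbors(compound):
    # Symmetric proposal: step down or up by one, clamped at the boundaries.
    return [max(compound - 1, 0), min(compound + 1, 20)]


chain = sample_compounds(toy_score, toy_neighbors, start=0, n_steps=200)
```

In a real application each "move" would be a chemical modification of the current compound and the score would come from a simulation or model of the target property; the sampled compounds would then serve as training data for the ML model.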
Reweighting non-equilibrium steady-state dynamics along collective variables
Computer simulations generate microscopic trajectories of complex systems at
a single thermodynamic state point. We recently introduced a Maximum Caliber
(MaxCal) approach for dynamical reweighting. Our approach mapped these
trajectories to a Markovian description on the configurational coordinates, and
reweighted path probabilities as a function of external forces. Trajectory
probabilities can be dynamically reweighted both from and to equilibrium or
non-equilibrium steady states. As the system's dimensionality increases, an
exhaustive description of the microtrajectories becomes prohibitive, even with
a Markovian assumption. Instead, we reduce the dimensionality of the
configurational space to collective variables (CVs). Going from configurational
to CV space, we define local entropy productions derived from configurationally
averaged mean forces. The entropy production is shown to be a suitable
constraint on MaxCal for non-equilibrium steady states expressed as a function
of CVs. We test the reweighting procedure on two systems: a particle subject to
a two-dimensional potential and a coarse-grained peptide. Our CV-based MaxCal
approach expands dynamical reweighting to larger systems, for both static and
dynamical properties, and across a large range of driving forces.
Comment: 12 pages, 7 figures
In silico screening of drug-membrane thermodynamics reveals linear relations between bulk partitioning and the potential of mean force
The partitioning of small molecules in cell membranes---a key parameter for
pharmaceutical applications---typically relies on experimentally available bulk
partitioning coefficients. Computer simulations provide a structural resolution
of the insertion thermodynamics via the potential of mean force, but require
significant sampling at the atomistic level. Here, we introduce high-throughput
coarse-grained molecular dynamics simulations to screen thermodynamic
properties. This application of physics-based models in a large-scale study of
small molecules establishes linear relationships between partitioning
coefficients and key features of the potential of mean force. This allows us to
predict the structure of the insertion from bulk experimental measurements for
more than 400,000 compounds. The potential of mean force hereby becomes an
easily accessible quantity---already recognized for its high predictability of
certain properties, e.g., passive permeation. Further, we demonstrate how
coarse graining helps reduce the size of chemical space, enabling a
hierarchical approach to screening small molecules.
Comment: 8 pages, 6 figures. Typos fixed, minor corrections
Computational compound screening of biomolecules and soft materials by molecular simulations
Decades of hardware, methodological, and algorithmic development have
propelled molecular dynamics (MD) simulations to the forefront of
materials-modeling techniques, bridging the gap between electronic-structure
theory and continuum methods. The physics-based approach makes MD appropriate
to study emergent phenomena, but simultaneously incurs significant
computational investment. This topical review explores the use of MD beyond
the scope of individual systems, considering many compounds instead. Such an
in silico screening approach makes MD amenable to establishing coveted
structure--property relationships. We specifically focus on biomolecules and
soft materials, characterized by the significant role of entropic contributions
and heterogeneous systems and scales. An account of the state of the art for
the implementation of an MD-based screening paradigm is described, including
automated force-field parametrization, system preparation, and efficient
sampling across both conformation and composition. Emphasis is placed on
machine-learning methods to enable MD-based screening. The resulting framework
enables the generation of compound--property databases and the use of advanced
statistical modeling to gather insight. The review further summarizes a number
of relevant applications.
Comment: 48 pages, 3 figures
Coarse-grained conformational surface hopping: Methodology and transferability
Coarse-grained (CG) conformational surface hopping (SH) adapts the concept of
multisurface dynamics, initially developed to describe electronic transitions
in chemical reactions, to accurately describe classical molecular dynamics at a
reduced level. The SH scheme couples distinct conformational basins (states),
each described by its own force field (surface), resulting in a significant
improvement of the approximation to the many-body potential of mean force
[Phys. Rev. Lett. 121, 256002 (2018)]. The present study first describes CG SH
in more detail, through both a toy model and a three-bead model of hexane. We
further extend the methodology to non-bonded interactions and report its impact
on liquid properties. Finally, we investigate the transferability of the
surfaces to distinct systems and thermodynamic state points, through a simple
tuning of the state probabilities. In particular, applications to variations in
temperature and chemical composition show good agreement with reference
atomistic calculations, introducing a promising "weak-transferability regime,"
where CG force fields can be shared across thermodynamic and chemical
neighborhoods.
Comment: 15 pages, 7 figures
Adversarial reverse mapping of condensed-phase molecular structures: Chemical transferability
Switching between different levels of resolution is essential for multiscale
modeling, but restoring details at higher resolution remains challenging. In
a previous study, we introduced deepBackmap: a deep neural-network-based
approach to reverse-map equilibrated molecular structures for condensed-phase
systems. Our method combines data-driven and physics-based aspects, leading to
high-quality reconstructed structures. In this work, we expand the scope of our
model and examine its chemical transferability. To this end, we train
deepBackmap solely on homogeneous molecular liquids of small molecules, and
apply it to a more challenging polymer melt. We augment the generator's
objective with different force-field-based terms as priors to regularize the
results. The best performing physical prior depends on whether we train for a
specific chemistry, or transfer our model. Our local environment representation
combined with the sequential reconstruction of fine-grained structures helps
reach transferability of the learned correlations.
Comment: 11 pages, 6 figures. arXiv admin note: text overlap with
arXiv:2003.0775